Add support for MANIFEST tailing #4710
Can this SuperVersion go away later? Do we need to do the thread-local super version trick?
You are right. If the secondary calls InstallSuperVersion (in another thread), this superversion can go away.
Is there a data race here? Do we need to do the thread-local super version trick?
You are right. Let me take a look.
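The lifetime concern raised in the two threads above can be illustrated with a minimal sketch. This is not RocksDB's code: `Snapshot`, `Holder`, `Acquire`, and `Install` are hypothetical names, and `std::shared_ptr` guarded by a mutex stands in for RocksDB's own reference counting and thread-local caching of the SuperVersion. The point is only that a reader that takes a counted reference keeps its view alive even if another thread installs a new one.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a SuperVersion: an immutable view of DB state.
struct Snapshot {
  explicit Snapshot(std::string d) : description(std::move(d)) {}
  std::string description;
};

// Hypothetical owner of the "current" snapshot.
class Holder {
 public:
  // Reader side: copy the shared_ptr under the lock, so the snapshot
  // cannot be freed while the caller still uses it.
  std::shared_ptr<const Snapshot> Acquire() const {
    std::lock_guard<std::mutex> lock(mu_);
    assert(current_ != nullptr);
    return current_;
  }

  // Writer side: publish a new snapshot. The old one is destroyed only
  // after the last reader drops its reference.
  void Install(std::string description) {
    auto next = std::make_shared<const Snapshot>(std::move(description));
    std::lock_guard<std::mutex> lock(mu_);
    current_ = std::move(next);
  }

 private:
  mutable std::mutex mu_;
  std::shared_ptr<const Snapshot> current_ =
      std::make_shared<const Snapshot>("initial");
};
```

A reader that calls `Acquire()` before another thread calls `Install()` continues to see the old snapshot. Holding a raw pointer instead would leave it dangling, which is the risk the reviewer points out.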
Who is supposed to call this function? I didn't find it called, except in a unit test.
Currently it's called only in a unit test (#4820). I'm thinking about adding a separate thread that calls this function in a future PR.
Wow, this is complicated... If you asked me before, I would say we don't support manifest switch for now. Anyway since it is already implemented...
I see. Let's keep it for now.
Hmm, so far I don't see how this safely handles manifest switching. When handling a new manifest, we need to start with an empty Version. Maybe I'm missing something, but I don't see it handled that way.
We do not keep Versions across different MANIFESTs. I assume you are referring to VersionBuilders, which actually save (buffer) the changes of one or multiple VersionEdits?
Discussed with @siying offline. For this PR, we are not going to worry about MANIFEST switching.
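The buffering role attributed to VersionBuilder in this thread can be pictured with a toy model. It is a sketch under loud assumptions, not RocksDB's classes: `Edit`, `EditBuffer`, `Apply`, and `SaveTo` here are hypothetical, files are reduced to bare numbers, and levels and column families are ignored. It only shows how several edits can be accumulated and then applied to a base file set in one step.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical, simplified edit: file numbers added and deleted.
struct Edit {
  std::vector<std::uint64_t> added_files;
  std::vector<std::uint64_t> deleted_files;
};

// Hypothetical buffer that accumulates the net effect of several edits.
class EditBuffer {
 public:
  // Fold one edit into the buffered state. Within an edit, additions are
  // processed before deletions; a later edit overrides an earlier one.
  void Apply(const Edit& edit) {
    for (std::uint64_t f : edit.added_files) {
      deleted_.erase(f);
      added_.insert(f);
    }
    for (std::uint64_t f : edit.deleted_files) {
      added_.erase(f);
      deleted_.insert(f);
    }
  }

  // Produce a new file set from `base` plus all buffered changes.
  // `base` itself is left untouched.
  std::set<std::uint64_t> SaveTo(const std::set<std::uint64_t>& base) const {
    std::set<std::uint64_t> result = base;
    for (std::uint64_t f : deleted_) {
      result.erase(f);
    }
    result.insert(added_.begin(), added_.end());
    return result;
  }

 private:
  std::set<std::uint64_t> added_;
  std::set<std::uint64_t> deleted_;
};
```

In this model, starting over on a new MANIFEST would amount to discarding the buffer and using an empty base set, which is the case the PR leaves out of scope.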
Test plan (to be updated)
```
$make clean && make -j32 all check
$./db_secondary_test
```
All tests must pass.
Moved to #4899
Summary:
This PR allows RocksDB to run in single-primary, multi-secondary process mode. The writer is a regular RocksDB (e.g. a `DBImpl`) instance playing the role of a primary. Multiple `DBImplSecondary` processes (secondaries) share the same set of SST files, MANIFEST, and WAL files with the primary. Secondaries tail the MANIFEST of the primary and apply updates to their own in-memory state of the file system, e.g. `VersionStorageInfo`.

This PR has several components:
1. (Originally in #4745). Add a `PathNotFound` subcode to `IOError` to denote the failure when a secondary tries to open a file which has been deleted by the primary.
2. (Similar to #4602). Add `FragmentBufferedReader` to handle a partially read trailing record at the end of a log, from which a future read can continue.
3. (Originally in #4710 and #4820). Add implementation of the secondary, i.e. `DBImplSecondary`.
3.1 Tail the primary's MANIFEST during recovery.
3.2 Tail the primary's MANIFEST during normal processing by calling `ReadAndApply`.
3.3 Tailing WAL will be in a future PR.
4. Add an example in 'examples/multi_processes_example.cc' to demonstrate the usage of a secondary RocksDB instance in a multi-process setting. Instructions to run the example can be found at the beginning of the source code.

Pull Request resolved: #4899
Differential Revision: D14510945
Pulled By: riversand963
fbshipit-source-id: 4ac1c5693e6012ad23f7b4b42d3c374fecbe8886
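The idea behind component 2, resuming after a partially written trailing record, can be sketched without any RocksDB code. Everything here is an assumption made for illustration: `TailingReader`, `Feed`, `ReadRecord`, and `EncodeRecord` are hypothetical names, and the framing (a 4-byte little-endian length followed by the payload) is deliberately simpler than RocksDB's real log format. It is not how `FragmentBufferedReader` is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical reader that tails a growing log. Bytes that do not yet
// form a complete record stay buffered until more data arrives.
class TailingReader {
 public:
  // Append bytes that have become visible at the end of the log.
  void Feed(const std::string& bytes) { buffer_.append(bytes); }

  // Returns true and fills *out if a complete record is buffered.
  // Returns false, leaving the fragment in place, if the header or
  // payload is still incomplete.
  bool ReadRecord(std::string* out) {
    assert(out != nullptr);
    constexpr std::size_t kHeaderSize = 4;
    if (buffer_.size() < kHeaderSize) {
      return false;  // Header itself is only partially written.
    }
    std::uint32_t len = 0;
    for (std::size_t i = 0; i < kHeaderSize; ++i) {
      len |= static_cast<std::uint32_t>(
                 static_cast<unsigned char>(buffer_[i]))
             << (8 * i);
    }
    if (buffer_.size() - kHeaderSize < len) {
      return false;  // Payload is only partially written.
    }
    out->assign(buffer_, kHeaderSize, len);
    buffer_.erase(0, kHeaderSize + len);
    return true;
  }

 private:
  std::string buffer_;
};

// Hypothetical writer-side helper: frame a payload with its length.
std::string EncodeRecord(const std::string& payload) {
  std::string framed;
  const std::uint32_t len = static_cast<std::uint32_t>(payload.size());
  for (int i = 0; i < 4; ++i) {
    framed.push_back(static_cast<char>((len >> (8 * i)) & 0xff));
  }
  framed += payload;
  return framed;
}
```

If the secondary sees only part of the last record, `ReadRecord` reports that nothing is available and a later call succeeds once the primary has finished writing and the remaining bytes have been fed in.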
This depends on #4602 and #4745
Test plan (to be updated)
All tests must pass.